Abstract: We address the problem of channel estimation corrupted
by noise and interference in massive MIMO systems. Interference,
which originates from pilot reuse (or contamination), can in
principle be discriminated on the basis of the distributions of
path angles and amplitudes. In this paper we propose novel robust
channel estimation algorithms exploiting path diversity in both
angle and power domains, relying on a suitable combination of
spatial filtering and amplitude based projection. The proposed
approaches are able to cope with a wide range of system and
topology scenarios, including those where, unlike in previous
works, interference channels may overlap with desired channels in
terms of multipath angles of arrival or exceed them in terms of
received power. In particular, we establish analytically the
conditions under which the proposed channel estimator is fully
decontaminated. Simulation results confirm the overall system
gains when using the new methods.

Index Terms: massive MIMO, pilot contamination, pilot
decontamination, channel estimation, covariance, subspace,
eigenvalue decomposition


I. Introduction 

Massive MIMO (also known as Large-Scale Antenna Systems) is
widely believed to be one of the key enablers of the future 5th
generation (5G) wireless systems thanks to its potential to
substantially enhance spectral and energy efficiencies compared
to traditional MIMO with fewer antennas. This technique is based
on the law of large numbers, which predicts that, as the number
of base station antennas increases, the vector channel for a
desired user terminal will grow more orthogonal to the vector
channel of an interfering user, thus allowing the base station to
reject interference by precoding, or even, as a low-complexity
approach, by simply aligning the beamforming vector with the
desired channel (“Maximum Ratio Combining”, or MRC), provided
that Channel State Information (CSI) is known at the base
station. In practice, however, CSI is acquired based on training
sequences sent by user terminals. Due to limited time and
frequency resources, non-orthogonal pilot sequences are typically
used by user terminals in neighboring cells, resulting in
residual channel estimation error. This effect, called pilot
contamination, has a detrimental impact on the actual achievable
spectral and energy efficiencies in real systems. As a result,
considerable research efforts have been spent in the last couple
of years towards alleviating pilot interference in massive MIMO
networks.

Such techniques range from smart design of pilot reuse schemes,
to channel estimation techniques based on coordinated pilot
allocation, to methods relying on multi-cell joint processing,
to nonlinear channel estimation techniques leveraging some
fundamental features of massive MIMO systems.

Two key features of massive MIMO channels that have been
previously reported are of particular interest here: 1) channels
of different users tend to be pairwise orthogonal when the
number of antennas increases, thus leading to a specific
subspace structure for the received data vectors that depend
on these channels, and 2) the channel covariance matrix exhibits
a low-rankness property whenever the multipath impinging on the
MIMO array spans a finite angular spread. The blind signal
subspace estimation in [12] capitalizes on the first property.
The second property has been utilized in earlier work assuming
knowledge of the long-term channel covariance matrices. While the
exploitation of the two properties individually has given rise to
a set of distinct original decontamination approaches, in this
work we exploit these two key features in a combined manner. In
doing so, we propose a novel approach towards mitigating pilot
contamination that exhibits much higher levels of robustness.

More specifically, in [12], the pairwise channel orthogonality
property makes it possible to blindly estimate the user-of-interest
channel subspace and to discriminate between user-of-interest
signals and interference based on the channel powers. In
practice, decontamination occurs via a projection driven by the
channel amplitudes. This approach works well within the
constraint that the interference channel is received with a
power level sufficiently lower than that of the desired channel,
a condition hard to guarantee for some edge-of-cell users.

In a completely different way, another approach based on a
linear minimum mean squared error (MMSE) estimator has been
adopted in earlier work to estimate the channel of interest via
projection of the received signals onto the user-of-interest
subspace. This subspace, identified by a channel covariance
matrix (a long-term one, as opposed to the instantaneous signal
correlation matrix of [12]), is related to the angular spread of
the signal of interest and makes it possible to annihilate the
interference from users with non-overlapping domains of
multipath angles-of-arrival (AoA). Interestingly, this latter
approach makes no assumption on received signal amplitudes and
can also discriminate against interfering users that are received
with similar or even higher powers. Yet, the approach fails to
decontaminate pilots when propagation scattering creates a large
angle spread, causing spatial overlap between desired and
interference channels.

In this paper, we point out that the strengths of these
two previously unrelated estimation methods are strongly
complementary, offering a unique opportunity for developing
robust channel estimation schemes. Thus, we aim to properly
merge the two projections in complementary domains while
keeping the individual benefits. In fact, we propose a family of
algorithms striking various performance/complexity trade-offs.

We start by presenting a first scheme named “covariance-aided
amplitude based projection” that effectively combines
projections in the angular and amplitude domains and exhibits
robustness to interference power/angle overlapping conditions.
We present an asymptotic analysis which reveals the conditions
under which the channel estimation error due to pilot
contamination and noise can be made to vanish. An intuitive
physical interpretation of this condition for a Uniform Linear
Array (ULA) is given in the form of the residual interference
channel energy contained in the multipath components that overlap
in angle with those of the desired channel. Although the physical
explanation is given for the ULA example, the general principle
applies to other antenna placement topologies.

The obtained condition for decontamination is in general 
less restrictive than the condition required by previous MMSE 
and the amplitude projection-based methods taken separately 
to achieve complete removal of pilot contamination. 

We then propose two low-complexity alternative schemes
called “subspace and amplitude based projection” and “MMSE
+ amplitude based projection” respectively. These schemes
achieve different complexity-performance trade-offs at a
moderate number of antennas. Specifically, the “subspace and
amplitude based projection” can be shown to reach an asymptotic
(in the number of antennas) decontamination result under the
same channel topology conditions as the first scheme.

More specifically, our contributions are as follows:

• We put forward a modification of the known method of
amplitude based projection, with increased robustness.

• We propose a spatial filter which helps bring down the
power of interference while preserving the signal of
interest. With this spatial filter, we present a novel channel
estimation scheme called “covariance-aided amplitude
based projection”. It combines the merits of the linear MMSE
estimator and the amplitude based projection method, yet can
be shown to have significant gains over these known
schemes.

• We analyze the asymptotic performance of this proposed
method and provide a weaker condition, compared to the
previous methods, under which the estimation error of the
proposed method goes to zero asymptotically in the limit
of a large number of antennas and data symbols. The
asymptotic analysis relies on mild technical conditions
such as uniform boundedness of the spectral norm of the
channel covariance.

• As the uniform boundedness of the largest eigenvalue of the
channel covariance was reported to be useful in previous
works (such as [19]) but not formally analyzed, we
identify in the case of a ULA a sufficient propagation
condition under which the uniform boundedness of the spectral
norm of the channel covariance is satisfied exactly.

• Finally, we propose two low-complexity alternatives of the
first method. An asymptotic performance characterization
is also given.

The paper is organized as follows: In Section II we introduce
the system model. Section III is a brief review of the MMSE
channel estimator and its asymptotic performance. In Section
IV we briefly recall the amplitude based projection method,
and we propose a first improvement of the method.
Then we present the novel covariance-aided amplitude based
projection in Section V-A for the setting of a single user per
cell, and the asymptotic performance analysis of this method
is shown in Section V-B. Section V-C presents a generalization
of the proposed scheme to the multi-user per cell scenario. In
Section VI we propose two low-complexity alternatives of our
previous method, and similar asymptotic results on the system
performance are given. Section VII shows numerical results.
Finally, Section VIII concludes the paper.

The notations adopted in the paper are as follows. We
use boldface to denote matrices and vectors. Specifically,
$\mathbf{I}_M$ denotes the $M \times M$ identity matrix.
$(\mathbf{X})^T$, $(\mathbf{X})^*$, and $(\mathbf{X})^H$
denote the transpose, conjugate, and conjugate transpose of
a matrix $\mathbf{X}$ respectively. $(\mathbf{X})^\dagger$ is the
Moore-Penrose pseudoinverse of $\mathbf{X}$. $\mathrm{tr}\{\cdot\}$
denotes the trace of a square matrix. $\|\cdot\|_2$ denotes the
norm of a vector when the argument is a vector, and the spectral
norm when the argument is a matrix. In particular, if
$\mathbf{A}$ is a Hermitian matrix, $\|\mathbf{A}\|_2$ is the
largest eigenvalue of $\mathbf{A}$. We index the eigenvalues of
$\mathbf{A}$ in non-increasing order and denote the $i$-th
eigenvalue of $\mathbf{A}$ by $\lambda_i\{\mathbf{A}\}$ and its
corresponding eigenvector by $\mathbf{e}_i\{\mathbf{A}\}$.
$\|\cdot\|_F$ stands for the Frobenius norm. $E\{\cdot\}$ denotes
the expectation. The Kronecker product of two matrices
$\mathbf{X}$ and $\mathbf{Y}$ is denoted by
$\mathbf{X} \otimes \mathbf{Y}$. $\mathrm{vec}(\mathbf{X})$ is the
vectorization of the matrix $\mathbf{X}$.
$\mathrm{diag}\{a_1, \ldots, a_N\}$ denotes a diagonal matrix or a
block diagonal matrix with $a_1, \ldots, a_N$ on the main
diagonal. $\triangleq$ is used for definition.


II. Signal and Channel Models

We consider a network of $L$ time-synchronized¹ cells,
with full spectrum reuse. Each base station (BS) is equipped
with $M$ antennas. There are $K$ single-antenna users in each
cell simultaneously served by their base station. The cellular
network operates in time-division duplexing (TDD) mode, and
due to channel reciprocity, the downlink channel is obtained
at the BS from the uplink training signal and data signal. Each

¹ Note that assuming synchronization between uplink pilots provides a worst
case scenario from a pilot contamination point of view, since any lack of 
synchronization will tend to statistically decorrelate the pilots. Furthermore, 
the main methods that we propose in this paper, i.e., the covariance-aided 
amplitude based projection and the subspace and amplitude based projection 
do not rely on accurate time synchronization. 











IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 64, NO. 11, 2016




base station estimates the channels of its $K$ users during a
coherence time interval. The pilot sequences inside each cell
are assumed orthogonal to each other in order to avoid
intra-cell interference. However, the same pilot pool is reused
in other cells, giving rise to the pilot contamination problem.
The pilot sequence assigned to the $k$-th user in a certain cell
is denoted by

$$\mathbf{s}_k \triangleq [\, s_{k1} \;\; s_{k2} \;\; \cdots \;\; s_{k\tau} \,]^T, \tag{1}$$

where $\tau$ is the length of a pilot sequence. Without loss of
generality we assume unitary average power of pilot symbols:

$$\mathbf{s}_{k_1}^T \mathbf{s}_{k_2}^* = \begin{cases} 0, & k_1 \neq k_2 \\ \tau, & k_1 = k_2. \end{cases}$$
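
As an illustration of the orthogonality relation above, one simple way to obtain such pilots is to take rows of a DFT matrix. The sketch below is only an example: the DFT construction and the names `dft_pilots`, `S` and `gram` are assumptions made here and are not prescribed by the paper.

```python
import numpy as np

def dft_pilots(num_users: int, tau: int) -> np.ndarray:
    """Return a (num_users x tau) matrix whose k-th row is the pilot s_k^T.

    Rows are taken from the tau-point DFT matrix, so every pilot symbol has
    unit modulus and distinct rows are orthogonal.
    """
    if num_users > tau:
        raise ValueError("orthogonal pilots require tau >= num_users")
    n = np.arange(tau)
    k = np.arange(num_users)[:, None]
    return np.exp(-2j * np.pi * k * n / tau)

S = dft_pilots(num_users=3, tau=8)
# Entry (k1, k2) of the Gram matrix equals s_{k1}^T s_{k2}^*.
gram = S @ S.conj().T
```

With these pilots `gram` is $\tau$ times the identity, which is exactly the unit-average-power orthogonality assumed in the text.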

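The $M \times 1$ channel vector between the $k$-th user located in
the $l$-th cell and the $j$-th base station is denoted by
$\mathbf{h}_{lk}^{(j)}$. The following classical multipath channel
model is adopted:

$$\mathbf{h}_{lk}^{(j)} = \frac{\beta_{lk}^{(j)}}{\sqrt{P}} \sum_{p=1}^{P} \mathbf{a}(\theta_{lkp}^{(j)})\, e^{j\varphi_{lkp}^{(j)}}, \tag{2}$$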

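where $P$ is an arbitrarily large number of i.i.d. paths, and
$\varphi_{lkp}^{(j)}$ is their i.i.d. random phase, which is
independent over channel indices $l$, $k$, $j$, and path index
$p$. $\mathbf{a}(\theta)$ is the steering (or phase response)
vector of the array for a path originating from the angle of
arrival $\theta$:

$$\mathbf{a}(\theta) = \begin{bmatrix} 1 \\ e^{-j 2\pi \frac{D}{\lambda} \cos(\theta)} \\ \vdots \\ e^{-j 2\pi \frac{(M-1)D}{\lambda} \cos(\theta)} \end{bmatrix}, \tag{3}$$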

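where $\lambda$ is the signal wavelength and $D$ is the antenna
spacing, which is assumed fixed. Note that we can limit $\theta$
to $\theta \in [0, \pi]$ because any $\theta \in [-\pi, 0)$ can be
replaced by $-\theta$ giving the same steering vector.
$\beta_{lk}^{(j)}$ is the path-loss coefficient

$$\beta_{lk}^{(j)} = \sqrt{\frac{\alpha}{\big(d_{lk}^{(j)}\big)^{\gamma}}}, \tag{4}$$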


in which $\gamma$ is the path-loss exponent, $d_{lk}^{(j)}$ is the
geographical distance between the user and the $j$-th base
station, and $\alpha$ is a constant. Note that the model is shown
for a ULA example for ease of exposition. Under this model, the
covariance matrix can be shown asymptotically to have low rank,
as long as the AoA support is bounded and strictly smaller than
$[0, \pi]$. However, several other channel models also exhibit a
similar low-rank property, which is the essential characteristic
exploited by the MMSE estimator. Hence our approach does not
depend on the use of the one-ring model described above.
In fact, our main results, namely Theorem 1 as well as the
general principle, carry over to other channel models and antenna
placement topologies.
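
The low-rank behaviour mentioned above can be checked numerically. The following sketch is illustrative only: it assumes half-wavelength spacing, AoAs uniformly distributed over a bounded interval, phases uniform on $[0, 2\pi)$, and a simple midpoint rule to approximate the covariance; the function names and parameter values are chosen here and do not come from the paper.

```python
import numpy as np

def steering(theta, M, spacing=0.5):
    """ULA steering vector a(theta); `spacing` is D / lambda."""
    return np.exp(-2j * np.pi * spacing * np.arange(M) * np.cos(theta))

def draw_channel(rng, M, lo, hi, beta=1.0, P=50, spacing=0.5):
    """One multipath draw: P paths, AoAs uniform on [lo, hi], uniform phases."""
    thetas = rng.uniform(lo, hi, size=P)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=P)
    A = np.stack([steering(t, M, spacing) for t in thetas], axis=1)  # M x P
    return beta / np.sqrt(P) * (A @ np.exp(1j * phases))

def ula_covariance(M, lo, hi, beta=1.0, spacing=0.5, grid=2000):
    """Approximate R = beta^2 E[a(theta) a(theta)^H] by a midpoint rule.

    With independent uniform phases the cross terms between paths average
    out, so the covariance reduces to this single-path expectation.
    """
    thetas = lo + (hi - lo) * (np.arange(grid) + 0.5) / grid
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return beta**2 * (A @ A.conj().T) / grid

M = 64
lo, hi = np.deg2rad(60), np.deg2rad(80)
R = ula_covariance(M, lo, hi)
eigvals = np.linalg.eigvalsh(R)[::-1]                 # non-increasing order
num_significant = int(np.sum(eigvals > 1e-3 * eigvals[0]))
h = draw_channel(np.random.default_rng(0), M, lo, hi)
```

For a 20 degree angular spread, `num_significant` is a small fraction of `M`, which is the low-rankness that the MMSE estimator exploits.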

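We define

$$\mathbf{H}_l^{(j)} \triangleq [\, \mathbf{h}_{l1}^{(j)} \;\; \mathbf{h}_{l2}^{(j)} \;\; \cdots \;\; \mathbf{h}_{lK}^{(j)} \,], \tag{5}$$

and the pilot matrix

$$\mathbf{S} \triangleq [\, \mathbf{s}_1 \;\; \mathbf{s}_2 \;\; \cdots \;\; \mathbf{s}_K \,]^T. \tag{6}$$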


During the training phase, the received signal at the base
station $j$ is

$$\mathbf{Y}^{(j)} = \sum_{l=1}^{L} \mathbf{H}_l^{(j)} \mathbf{S} + \mathbf{N}^{(j)}, \tag{7}$$

where $\mathbf{N}^{(j)} \in \mathbb{C}^{M \times \tau}$ is the spatially
and temporally white additive Gaussian noise (AWGN) with
zero-mean and element-wise variance $\sigma_n^2$. Then, during the
uplink data transmission phase, each user transmits $C$ data
symbols. The received data signal at base station $j$ is given by:

$$\mathbf{W}^{(j)} = \sum_{l=1}^{L} \mathbf{H}_l^{(j)} \mathbf{X}_l + \mathbf{Z}^{(j)}, \tag{8}$$

where $\mathbf{X}_l \in \mathbb{C}^{K \times C}$ is the matrix of
transmitted symbols of all users in the $l$-th cell. The symbols
are i.i.d. with zero-mean and unit average element-wise variance.
$\mathbf{Z}^{(j)} \in \mathbb{C}^{M \times C}$ is the AWGN with
zero-mean and element-wise variance $\sigma_n^2$. Note that the
block fading channel is constant during the transmission of the
$\tau$ pilot symbols and the $C$ data symbols.

III. MMSE Channel Estimation

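We briefly recall the MMSE channel estimator in a multi-cell
setting with a single user per cell. Without loss of generality,
we assume cell $j$ is the target cell, and
$\mathbf{h}_j^{(j)} \in \mathbb{C}^{M \times 1}$ is the desired
channel, while $\mathbf{h}_l^{(j)} \in \mathbb{C}^{M \times 1}$,
$\forall l \neq j$, are the interference channels. We rewrite (7)
in a vectorized form,

$$\mathbf{y}^{(j)} = \bar{\mathbf{S}} \sum_{l=1}^{L} \mathbf{h}_l^{(j)} + \mathbf{n}^{(j)}, \tag{9}$$

where $\mathbf{y}^{(j)} = \mathrm{vec}(\mathbf{Y}^{(j)})$ and
$\mathbf{n}^{(j)} = \mathrm{vec}(\mathbf{N}^{(j)})$. A pilot sequence
$\mathbf{s}$ is shared by all users. The pilot matrix
$\bar{\mathbf{S}}$ is given by

$$\bar{\mathbf{S}} \triangleq \mathbf{s} \otimes \mathbf{I}_M. \tag{10}$$

We define the channel covariance matrices

$$\mathbf{R}_l^{(j)} \triangleq E\{\mathbf{h}_l^{(j)} \mathbf{h}_l^{(j)H}\} \in \mathbb{C}^{M \times M}, \quad l = 1, \ldots, L, \tag{11}$$

where the expectation is taken over channel realizations.

A linear MMSE estimator for $\mathbf{h}_j^{(j)}$ is given by

$$\hat{\mathbf{h}}_j^{(j)\mathrm{MMSE}} = \mathbf{R}_j^{(j)} \Big( \tau \sum_{l=1}^{L} \mathbf{R}_l^{(j)} + \sigma_n^2 \mathbf{I}_M \Big)^{-1} \bar{\mathbf{S}}^H \mathbf{y}^{(j)}. \tag{12}$$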
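
For readers who want to see where the factor $\tau$ in (12) comes from, the short derivation below applies the standard linear MMSE formula to (9), writing $\bar{\mathbf{S}} = \mathbf{s} \otimes \mathbf{I}_M$ for the pilot matrix of (10). It is a sketch under assumptions that the text implies but does not spell out: the channels of different cells are zero-mean and mutually uncorrelated, the noise is independent of the channels, and $\mathbf{s}^H \mathbf{s} = \tau$.

```latex
\begin{align*}
\bar{\mathbf{S}}^H \bar{\mathbf{S}}
  &= (\mathbf{s}^H \otimes \mathbf{I}_M)(\mathbf{s} \otimes \mathbf{I}_M)
   = (\mathbf{s}^H \mathbf{s})\, \mathbf{I}_M = \tau\, \mathbf{I}_M, \\
\hat{\mathbf{h}}_j^{(j)\mathrm{MMSE}}
  &= \mathbf{R}_j^{(j)} \bar{\mathbf{S}}^H
     \Big( \bar{\mathbf{S}} \sum_{l=1}^{L} \mathbf{R}_l^{(j)} \bar{\mathbf{S}}^H
           + \sigma_n^2 \mathbf{I}_{\tau M} \Big)^{-1} \mathbf{y}^{(j)} \\
  &= \mathbf{R}_j^{(j)}
     \Big( \bar{\mathbf{S}}^H \bar{\mathbf{S}} \sum_{l=1}^{L} \mathbf{R}_l^{(j)}
           + \sigma_n^2 \mathbf{I}_M \Big)^{-1} \bar{\mathbf{S}}^H \mathbf{y}^{(j)}
   = \mathbf{R}_j^{(j)}
     \Big( \tau \sum_{l=1}^{L} \mathbf{R}_l^{(j)} + \sigma_n^2 \mathbf{I}_M \Big)^{-1}
     \bar{\mathbf{S}}^H \mathbf{y}^{(j)} .
\end{align*}
```

The second step uses the push-through identity $\mathbf{B}^H (\mathbf{B}\mathbf{A}\mathbf{B}^H + \sigma^2 \mathbf{I})^{-1} = (\mathbf{B}^H \mathbf{B} \mathbf{A} + \sigma^2 \mathbf{I})^{-1} \mathbf{B}^H$, so only an $M \times M$ inverse is needed.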

As shown in previous works, for a base station
equipped with a ULA, the above MMSE estimator can fully
eliminate the effects of interfering channels when $M \to \infty$,
under a specific “non-overlap” condition on the distributions
of multipath AoAs for the desired and interference channels.
This condition is formalized as follows. Assume the user in
cell $j$ is our target (desired) user. Denote the angular support
of the desired channel as $\Phi_d$ (i.e., the probability density
function (PDF) $p_d(\theta)$ of the AoA of the desired channel
$\mathbf{h}_j^{(j)}$ satisfies $p_d(\theta) > 0$ if
$\theta \in \Phi_d$ and $p_d(\theta) = 0$ if
$\theta \notin \Phi_d$), and similarly the union of the angular
supports of all interference channels $\mathbf{h}_l^{(j)}$
($l \neq j$) as $\Phi_i$. If $\Phi_d \cap \Phi_i = \emptyset$,
then, as $M \to \infty$, (12) converges to an interference-free
estimate. In practice the “non-overlap” condition is hard to
guarantee and the finite-$M$ performance of the MMSE scheme
depends on angular spread and user location, although the latter
can be shaped via the use of so-called coordinated pilot
assignment (CPA).
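
The effect of the “non-overlap” condition can be illustrated with a small simulation. The sketch below is not the paper's simulation setup: it assumes two cells with one user each, disjoint uniform AoA supports, half-wavelength spacing, and covariances approximated numerically; all names and parameter values are chosen here for illustration. It uses the fact that $\bar{\mathbf{S}}^H \mathbf{y} = \mathbf{Y}\mathbf{s}^*$.

```python
import numpy as np

def ula_covariance(M, lo, hi, beta=1.0, spacing=0.5, grid=2000):
    """beta^2 E[a(theta) a(theta)^H] for AoA uniform on [lo, hi] (radians)."""
    thetas = lo + (hi - lo) * (np.arange(grid) + 0.5) / grid
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return beta**2 * (A @ A.conj().T) / grid

def draw_channel(rng, M, lo, hi, beta=1.0, P=50, spacing=0.5):
    """Multipath draw with P paths, uniform AoAs and uniform phases."""
    thetas = rng.uniform(lo, hi, size=P)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=P)
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return beta / np.sqrt(P) * (A @ np.exp(1j * phases))

def mmse_estimate(Y, s, R_desired, R_all, sigma2):
    """R_j (tau * sum_l R_l + sigma2 I)^{-1} S_bar^H y, using S_bar^H y = Y s^*."""
    M, tau = Y.shape
    R_sum = tau * sum(R_all) + sigma2 * np.eye(M)
    return R_desired @ np.linalg.solve(R_sum, Y @ s.conj())

def nmse(h_hat, h):
    return np.linalg.norm(h_hat - h) ** 2 / np.linalg.norm(h) ** 2

rng = np.random.default_rng(1)
M, tau, sigma2 = 64, 8, 0.01
supports = [(np.deg2rad(50), np.deg2rad(70)),     # desired user
            (np.deg2rad(110), np.deg2rad(130))]   # interferer, disjoint AoAs
R = [ula_covariance(M, lo, hi) for lo, hi in supports]
s = np.exp(2j * np.pi * rng.integers(0, 4, size=tau) / 4)   # unit-modulus pilot

ls_err, mmse_err = [], []
for _ in range(20):
    h = [draw_channel(rng, M, lo, hi) for lo, hi in supports]
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, tau))
                                   + 1j * rng.standard_normal((M, tau)))
    Y = np.outer(h[0] + h[1], s) + noise          # both users send the same pilot
    ls_err.append(nmse(Y @ s.conj() / tau, h[0])) # least-squares baseline
    mmse_err.append(nmse(mmse_estimate(Y, s, R[0], R, sigma2), h[0]))
ls_nmse, mmse_nmse = np.mean(ls_err), np.mean(mmse_err)
```

In this disjoint-support setting the least-squares estimate is fully contaminated by the interferer, while the MMSE estimate suppresses most of it; making the two supports overlap removes that advantage.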


IV. Amplitude based projection 


Interestingly, angle is not the only domain in which
interference can be discriminated, as revealed by a completely
different approach to pilot decontamination [12]. In that
approach the empirical instantaneous covariance matrix built
from the received data (8) is exploited, in contrast to the use
of long-term covariance matrices in (12). Assume cell $j$ is
our target cell and each cell has $K$ users. The eigenvalue
decomposition (EVD) of $\mathbf{W}^{(j)} \mathbf{W}^{(j)H} / C$ is
written as

$$\frac{1}{C} \mathbf{W}^{(j)} \mathbf{W}^{(j)H} = \mathbf{U}^{(j)} \mathbf{\Lambda}^{(j)} \mathbf{U}^{(j)H}, \tag{13}$$

where $\mathbf{U}^{(j)} = [\, \mathbf{u}_{j1} \;\; \mathbf{u}_{j2} \;\; \cdots \;\; \mathbf{u}_{jM} \,] \in \mathbb{C}^{M \times M}$
is a unitary matrix and
$\mathbf{\Lambda}^{(j)} = \mathrm{diag}\{\lambda_1^{(j)}, \ldots, \lambda_M^{(j)}\}$
with its diagonal entries sorted in a non-increasing order. By
extracting the first $K$ columns of $\mathbf{U}^{(j)}$, i.e., the
eigenvectors corresponding to the strongest $K$ eigenvalues, we
obtain an orthogonal basis

$$\mathbf{E}^{(j)} \triangleq [\, \mathbf{u}_{j1} \;\; \mathbf{u}_{j2} \;\; \cdots \;\; \mathbf{u}_{jK} \,] \in \mathbb{C}^{M \times K}. \tag{14}$$

The basic idea in [12] is to use the orthogonal basis
$\mathbf{E}^{(j)}$ as an estimate for the span of
$\mathbf{H}_j^{(j)}$, which includes all desired user channels in
cell $j$. Then, by projecting the received signal onto the
subspace spanned by $\mathbf{E}^{(j)}$, most of the signal of
interest is preserved. In contrast, the interference signal is
canceled out thanks to the asymptotic property that the user
channels are pairwise orthogonal as the number of antennas
tends to infinity. Thus, after the above mentioned projection,
the estimate of the multi-user channel is given by:

$$\hat{\mathbf{H}}_j^{(j)\mathrm{AM}} = \frac{1}{\tau} \mathbf{E}^{(j)} \mathbf{E}^{(j)H} \mathbf{Y}^{(j)} \mathbf{S}^H. \tag{15}$$

Note here that interference and desired channel directions are
discriminated on the basis of channel amplitudes and not AoA,
hence the estimate is labeled “AM” for “Amplitude”. As a way
to guarantee an asymptotic separation between the signal of
interest and the interference in terms of power, it has been
suggested to introduce power control in the network.
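
The amplitude based projection can be sketched in a few lines. The example below is a hedged illustration rather than the authors' implementation: it assumes a single user per cell ($K = 1$), two cells, an interferer received at half the amplitude of the desired user, fully overlapping AoA supports, and QPSK data; the function `am_estimate` and all parameter values are chosen here.

```python
import numpy as np

def draw_channel(rng, M, lo, hi, beta=1.0, P=50, spacing=0.5):
    """Multipath draw with P paths, uniform AoAs and uniform phases."""
    thetas = rng.uniform(lo, hi, size=P)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=P)
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return beta / np.sqrt(P) * (A @ np.exp(1j * phases))

def awgn(rng, shape, sigma2):
    return np.sqrt(sigma2 / 2) * (rng.standard_normal(shape)
                                  + 1j * rng.standard_normal(shape))

def am_estimate(Y, W, S, K):
    """(1/tau) E E^H Y S^H, with E the K dominant eigenvectors of W W^H / C."""
    C, tau = W.shape[1], S.shape[1]
    _, vecs = np.linalg.eigh(W @ W.conj().T / C)   # eigenvalues ascending
    E = vecs[:, -K:]                               # K strongest directions
    return E @ (E.conj().T @ (Y @ S.conj().T)) / tau

def nmse(h_hat, h):
    return np.linalg.norm(h_hat - h) ** 2 / np.linalg.norm(h) ** 2

rng = np.random.default_rng(2)
M, tau, C, sigma2 = 64, 8, 400, 0.01
lo, hi = np.deg2rad(30), np.deg2rad(150)           # same AoA support for both users
betas = [1.0, 0.5]                                 # interferer is weaker
S = np.exp(2j * np.pi * rng.integers(0, 4, size=(1, tau)) / 4)   # K x tau pilot matrix

ls_err, am_err = [], []
for _ in range(20):
    h = [draw_channel(rng, M, lo, hi, b) for b in betas]
    X = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=(2, C))))
    Y = (h[0] + h[1])[:, None] @ S + awgn(rng, (M, tau), sigma2)
    W = np.column_stack(h) @ X + awgn(rng, (M, C), sigma2)
    ls_err.append(nmse((Y @ S.conj().T / tau)[:, 0], h[0]))
    am_err.append(nmse(am_estimate(Y, W, S, K=1)[:, 0], h[0]))
ls_nmse, am_nmse = np.mean(ls_err), np.mean(am_err)
```

Because the interferer is weaker, the dominant eigenvector tracks the desired channel and the projection removes most of the contamination even though the AoA supports coincide. Raising the interferer's `beta` above 1 reverses the outcome, which is the limitation the text goes on to address.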


Remark 1 Generalized amplitude projection

As shown in [12], the above method works well when
the desired channels and interference channels are separable in
the power domain, i.e., the instantaneous powers of all desired
channels are higher than that of any interference channel.
In practice, however, this assumption is not always guaranteed.
For a finite number of antennas, the short-term fading
realization can cause the interference subspace to spill over
the desired one. An enhanced version can somewhat mitigate
this problem by considering a generalized amplitude based
projection. This consists in selecting a possibly larger number
of dominant eigenvectors to form $\mathbf{E}^{(j)}$, where the
number of selected eigenvectors is the number of eigenvalues in
$\mathbf{\Lambda}^{(j)}$ that are greater than
$\mu \lambda_K^{(j)}$. $\mu$ is a design parameter that satisfies
$0 < \mu < 1$. See Section VII for details on the choice of $\mu$.
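
The eigenvector selection rule of Remark 1 can be sketched as follows. This is a minimal illustration that assumes the threshold is taken relative to the $K$-th largest eigenvalue of the sample covariance; the function name `dominant_basis` and the toy data are hypothetical.

```python
import numpy as np

def dominant_basis(W, K, mu):
    """Eigenvectors of W W^H / C whose eigenvalues exceed mu * (K-th largest).

    Since 0 < mu < 1, the K strongest eigenvectors are always kept (provided
    the K-th eigenvalue is positive); weaker ones are added only when their
    eigenvalues are comparable to the K-th one.
    """
    if not 0.0 < mu < 1.0:
        raise ValueError("mu must lie strictly between 0 and 1")
    C = W.shape[1]
    vals, vecs = np.linalg.eigh(W @ W.conj().T / C)
    order = np.argsort(vals)[::-1]                 # non-increasing order
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > mu * vals[K - 1]
    return vecs[:, keep], vals

# Toy data: the sample covariance has eigenvalues 4, 2.25, 0.25 and 0.0025.
W_toy = np.diag([4.0, 3.0, 1.0, 0.1])
E_loose, vals_toy = dominant_basis(W_toy, K=1, mu=0.5)    # threshold 2.0
E_tight, _ = dominant_basis(W_toy, K=1, mu=0.9)           # threshold 3.6
```

A smaller `mu` keeps more directions, which protects the desired channel when a strong interferer captures the leading eigenvector, at the cost of letting more interference through the projection.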


V. Covariance-aided amplitude based projection

Note that both previous methods, while being able to tackle
pilot contamination in quite different ways, perform well only
in some restricted user/channel topologies. For a ULA base
station, the MMSE method leads to interference-free channel
estimates under the strict requirement that the desired and
interference channels do not overlap in their AoA regions, while
the amplitude based projection requires that no interference
channel power exceeds that of a desired channel to achieve a
similar result. Unfortunately, due to the random user location
and scattering effects, it is quite unlikely that these
conditions hold at all times. Therefore, by combining the useful
properties of both the MMSE and the amplitude projection
methods, we propose below novel estimation methods that
lead to enhanced robustness in a realistic cellular scenario.


A. Single user per cell 

For ease of exposition we first consider a simplified scenario
where intra-cell interference is ignored by assuming that each
cell has only one user, i.e., $K = 1$. The users in different
cells share the same pilot sequence $\mathbf{s}$. Then, with
proper modifications, we will generalize this method to the
setting of multiple users per cell in Section V-C.

The objective is to combine long-term statistics, which
include spatial distribution information, with the short-term
empirical covariance, which contains instantaneous amplitude
and direction channel information. Hence, a spatial distribution
filter can be associated with an instantaneous projection
operator to help discriminate against any interference terms
whose spatial directions live in a subspace orthogonal to that
of the desired channel. The intuition is that such a spatial
filter may bring the residual interference to a level that is
acceptable to the instantaneous projection-based channel
estimator.

In order to carry out the above intuition, we introduce
a long-term statistical filter $\mathbf{\Xi}_j$, which is based on
channel covariance matrices in a way similar to that used by the
MMSE filter in (12).



Note that the linear filter $\mathbf{\Xi}_j$ makes it possible to
discriminate against the interference in the angular domain by
projecting away from multipath AoAs that are occupied by
interference. Note also that the choice of spatial filter is
justified by the fact that the full information of the desired
channel is preserved, as $\mathbf{h}_j^{(j)}$ lies in the signal
space of $\mathbf{R}_j^{(j)}$. In fact, the desired channel is
recoverable using another linear transformation
$\mathbf{\Xi}'_j$:



as can be seen from the following equality 

J J J J J J J J 


(18) 


where the columns of $\mathbf{V}_j$ are the eigenvectors of
$\mathbf{R}_j^{(j)}$ corresponding to non-zero eigenvalues.

The spatial filter is applied to the received data signal at
base station $j$ as

$$\widetilde{\mathbf{W}}_j \triangleq \mathbf{\Xi}_j \mathbf{W}^{(j)}. \tag{19}$$


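The amplitude-based method shown in Section IV can now
be applied to the filtered received data to get rid of the
residual interference. Take the eigenvector corresponding to the
largest eigenvalue of the matrix
$\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H / C$:

$$\widetilde{\mathbf{u}}_{j1} = \mathbf{e}_1\Big\{ \frac{1}{C} \widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H \Big\}. \tag{20}$$

Hence $\widetilde{\mathbf{u}}_{j1}$ can be considered as an estimate
of the direction of the vector $\mathbf{\Xi}_j \mathbf{h}_j^{(j)}$.

We then cancel the effect of the pre-multiplicative matrix
$\mathbf{\Xi}_j$ using $\mathbf{\Xi}'_j$ in (17), and we obtain an
estimate of the direction of the channel vector
$\mathbf{h}_j^{(j)}$ as follows:

$$\bar{\mathbf{u}}_{j1} = \frac{\mathbf{\Xi}'_j \widetilde{\mathbf{u}}_{j1}}{\big\| \mathbf{\Xi}'_j \widetilde{\mathbf{u}}_{j1} \big\|_2}. \tag{21}$$

Finally, the phase and amplitude ambiguities of the desired
channel can be resolved by projecting the LS estimate onto
the subspace spanned by $\bar{\mathbf{u}}_{j1}$:

$$\hat{\mathbf{h}}_j^{(j)\mathrm{CA}} = \frac{1}{\tau} \bar{\mathbf{u}}_{j1} \bar{\mathbf{u}}_{j1}^H \bar{\mathbf{S}}^H \mathbf{y}^{(j)}, \tag{22}$$

where the superscript “CA” denotes the covariance-aided
amplitude domain projection.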

The algorithm is summarized below:


Algorithm 1 Covariance-aided Amplitude based Projection

1: Take the first eigenvector of $\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H / C$ as in (20), with $\widetilde{\mathbf{W}}_j$ being the filtered data signal.

2: Reverse the effect of the spatial filter using (21).

3: Resolve the phase and amplitude ambiguities by (22).
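
The three steps of Algorithm 1 can be sketched as below. This is an illustration under explicit assumptions, not the authors' code: the filter expressions (16) and (17) are not reproduced in this text, so the sketch assumes the MMSE-like forms $\mathbf{\Xi}_j = (\sum_l \mathbf{R}_l^{(j)} + \sigma_n^2 \mathbf{I}_M)^{-1} \mathbf{R}_j^{(j)}$ and $\mathbf{\Xi}'_j = \mathbf{R}_j^{(j)\dagger} (\sum_l \mathbf{R}_l^{(j)} + \sigma_n^2 \mathbf{I}_M)$, chosen so that $\mathbf{\Xi}'_j \mathbf{\Xi}_j$ is the projector onto the range of $\mathbf{R}_j^{(j)}$. The scenario (a stronger interferer with partially overlapping AoAs), the `rcond` truncation and all names are likewise chosen here.

```python
import numpy as np

def ula_covariance(M, lo, hi, beta=1.0, spacing=0.5, grid=2000):
    """beta^2 E[a(theta) a(theta)^H], AoA uniform on [lo, hi] (midpoint rule)."""
    thetas = lo + (hi - lo) * (np.arange(grid) + 0.5) / grid
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return beta**2 * (A @ A.conj().T) / grid

def draw_channel(rng, M, lo, hi, beta=1.0, P=50, spacing=0.5):
    """Multipath draw with P paths, uniform AoAs and uniform phases."""
    thetas = rng.uniform(lo, hi, size=P)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=P)
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return beta / np.sqrt(P) * (A @ np.exp(1j * phases))

def awgn(rng, shape, sigma2):
    return np.sqrt(sigma2 / 2) * (rng.standard_normal(shape)
                                  + 1j * rng.standard_normal(shape))

def ca_estimate(Y, W, s, R_all, j, sigma2, rcond=1e-2):
    """Covariance-aided amplitude based projection for one user per cell."""
    M, tau = Y.shape
    C = W.shape[1]
    R_sum = sum(R_all) + sigma2 * np.eye(M)
    Xi = np.linalg.solve(R_sum, R_all[j])              # ASSUMED form of (16)
    # ASSUMED form of (17); rcond discards weak eigen-directions of R_j.
    Xi_back = np.linalg.pinv(R_all[j], rcond=rcond, hermitian=True) @ R_sum
    W_f = Xi @ W                                       # spatial filtering, (19)
    _, vecs = np.linalg.eigh(W_f @ W_f.conj().T / C)
    u_tilde = vecs[:, -1]                              # step 1, (20)
    u_bar = Xi_back @ u_tilde
    u_bar /= np.linalg.norm(u_bar)                     # step 2, (21)
    h_ls = Y @ s.conj() / tau                          # LS estimate
    return u_bar * np.vdot(u_bar, h_ls)                # step 3, (22)

def nmse(h_hat, h):
    return np.linalg.norm(h_hat - h) ** 2 / np.linalg.norm(h) ** 2

rng = np.random.default_rng(3)
M, tau, C, sigma2 = 64, 8, 400, 0.01
users = [(np.deg2rad(40), np.deg2rad(80), 1.0),        # desired user (index 0)
         (np.deg2rad(70), np.deg2rad(130), 1.5)]       # stronger, overlapping interferer
R = [ula_covariance(M, lo, hi, beta) for lo, hi, beta in users]
s = np.exp(2j * np.pi * rng.integers(0, 4, size=tau) / 4)

ls_err, ca_err = [], []
for _ in range(10):
    h = [draw_channel(rng, M, lo, hi, beta) for lo, hi, beta in users]
    X = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, size=(2, C))))
    Y = np.outer(h[0] + h[1], s) + awgn(rng, (M, tau), sigma2)
    W = np.column_stack(h) @ X + awgn(rng, (M, C), sigma2)
    ls_err.append(nmse(Y @ s.conj() / tau, h[0]))
    ca_err.append(nmse(ca_estimate(Y, W, s, R, 0, sigma2), h[0]))
ls_nmse, ca_nmse = np.mean(ls_err), np.mean(ca_err)
```

The scenario is deliberately one where neither earlier method is expected to work alone: the interferer is stronger than the desired user and its AoAs partly overlap. After spatial filtering only the overlapping part of the interference survives, so the dominant eigenvector follows the desired channel.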


The complexity of the proposed estimation scheme is
briefly evaluated.

We note that the computation of the matrix inversions in
(16) has a complexity order of $O(M^{2.37})$. However, these
computations are performed in a preamble phase and their
cost is negligible under the underlying assumption of channel
stationarity implicitly made in this article. In practical
systems, the matrix inversion in (16) is performed when the
channel statistics are updated. Since the channel statistics are
typically updated on a time scale much larger than the channel
coherence time, i.e., the time scale for the applicability of
Algorithm 1, their computational cost is negligible. Therefore,
we can focus on the complexity of Algorithm 1 only.

In step 1, the spatial filtering of the data signals in (19)
and the computation of the covariance matrix
$\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H / C$ are performed
along with the computation of the dominant eigenvector of an
$M \times M$ matrix as in (20). The former computation has
a complexity order $O(CM^2)$ while, by applying the classical
power method, the computation of the dominant eigenvector has
a complexity order $O(M^2)$. Both step 2 and step 3 require
multiplications of matrices by $M$-dimensional vectors and
thus both have a complexity order $O(M^2)$. Then, the global
complexity of the algorithm is dominated by the complexity
of step 1, which is $O(CM^2)$.
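
The power method referred to above can be sketched as follows. This is a generic textbook power iteration, shown here only to make the $O(M^2)$ per-iteration cost concrete; the stopping rule, iteration budget and test matrix are assumptions of this example.

```python
import numpy as np

def power_method(A, num_iters=200, tol=1e-12, seed=0):
    """Dominant eigenpair of a Hermitian positive semi-definite matrix A."""
    rng = np.random.default_rng(seed)
    M = A.shape[0]
    v = rng.standard_normal(M) + 1j * rng.standard_normal(M)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v                                  # one O(M^2) matrix-vector product
        w /= np.linalg.norm(w)
        converged = abs(abs(np.vdot(v, w)) - 1.0) < tol
        v = w
        if converged:                              # direction stopped changing
            break
    return np.vdot(v, A @ v).real, v               # Rayleigh quotient and eigenvector

# Test matrix: sample covariance of a strong rank-one signal plus noise.
rng = np.random.default_rng(4)
M, C = 16, 40
h = rng.standard_normal(M) + 1j * rng.standard_normal(M)
x = np.exp(2j * np.pi * rng.random(C))             # unit-modulus symbols
noise = rng.standard_normal((M, C)) + 1j * rng.standard_normal((M, C))
W = 3.0 * np.outer(h, x) + noise
A = W @ W.conj().T / C

lam, v = power_method(A)
ref_vals, ref_vecs = np.linalg.eigh(A)             # reference decomposition
alignment = abs(np.vdot(v, ref_vecs[:, -1]))       # 1.0 means same direction
```

The iteration converges quickly when the largest eigenvalue is well separated from the rest, which is the regime Algorithm 1 relies on after spatial filtering.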

The ability of the above estimator to combine the advantages
of the previously known angle and amplitude projection
based estimators is now analyzed theoretically. In particular
we are interested in the conditions under which full pilot
decontamination can be achieved asymptotically in the limit
of the number of antennas $M$ and data symbols $C$. In order to
facilitate the analysis, we introduce the following condition.

Condition C1: The spectral norm of $\mathbf{R}_l^{(j)}$ is
uniformly bounded:

$$\forall M \in \mathbb{Z}^+ \text{ and } \forall l \in \{1, \ldots, L\},\ \exists \zeta, \text{ s.t. } \big\| \mathbf{R}_l^{(j)} \big\|_2 < \zeta, \tag{23}$$

where $\mathbb{Z}^+$ is the set of positive integers and $\zeta$ is
a constant.

Condition C1 can be interpreted as describing all the
scenarios in which the channel energy is spread over a
subspace whose dimension grows with $M$. Note that the same
assumption can be found in some other papers. The
corresponding physical condition is now investigated for the
case of a ULA with a typical antenna spacing $D$ (less than or
equal to half a wavelength).


Proposition 1 Let $\Phi$ be the AoA support of a certain user. Let
$p(\theta)$ be the probability density function of the AoA of that
user. If $p(\theta)$ is uniformly bounded, i.e.,
$p(\theta) < +\infty, \forall \theta \in \Phi$, and $\Phi$ lies in a
closed interval that does not include the parallel directions
with respect to the array, i.e., $0, \pi \notin \Phi$, then
the spectral norm of the user’s covariance $\mathbf{R}$ is
uniformly bounded.

Proof: See Appendix A. □

Note that this result has been hinted at in earlier work by
resorting to an approximation of $\mathbf{R}$ by a circulant
matrix. Our Proposition 1 here gives a formal proof of the
previous approximated result.
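
Proposition 1 can be checked numerically on a small example. The sketch below is illustrative and rests on assumptions made here: uniform AoA densities, half-wavelength spacing, and a midpoint-rule approximation of the covariance. It compares a support well away from the array axis with one that touches the parallel direction $\theta = 0$.

```python
import numpy as np

def ula_covariance(M, lo, hi, spacing=0.5, grid=4000):
    """E[a(theta) a(theta)^H] for AoA uniform on [lo, hi] (midpoint rule)."""
    thetas = lo + (hi - lo) * (np.arange(grid) + 0.5) / grid
    A = np.exp(-2j * np.pi * spacing * np.outer(np.arange(M), np.cos(thetas)))
    return (A @ A.conj().T) / grid

def spectral_norms(lo_deg, hi_deg, sizes):
    """Spectral norm of the covariance for each array size in `sizes`."""
    return [float(np.linalg.norm(
                ula_covariance(M, np.deg2rad(lo_deg), np.deg2rad(hi_deg)), 2))
            for M in sizes]

sizes = [16, 32, 64, 128]
broadside = spectral_norms(60, 120, sizes)   # support excludes 0 and pi
endfire = spectral_norms(0, 20, sizes)       # support touches theta = 0
```

For the first support the spectral norm stays below a constant as $M$ grows, as the proposition predicts, whereas for the support that includes the array axis it keeps increasing with $M$.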

As another interpretation of Condition C1, it is worth noting
that when this condition is not satisfied, there is no guarantee
that the asymptotic pairwise orthogonality of different users’
channels holds. In other words, the quantity
$\frac{1}{M} \mathbf{h}_j^{(j)H} \mathbf{h}_l^{(j)}$, $\forall l \neq j$,
may not converge to zero, which is an adverse condition for
all massive MIMO methods. However, our proposed methods
still have significant performance gains under this adverse
circumstance. Moreover, C1 is a sufficient condition and we
believe it can be weakened.

B. Asymptotic performance of the proposed CA estimator 

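We now look into the performance analysis of the proposed
estimation scheme. Let us define

$$\alpha_l^{(j)} \triangleq \lim_{M \to \infty} \frac{1}{M} \mathrm{tr}\big\{ \mathbf{\Xi}_j \mathbf{R}_l^{(j)} \mathbf{\Xi}_j^H \big\}, \quad \forall l = 1, \ldots, L. \tag{24}$$


Theorem 1 Given Condition C1, if the following inequality
holds true

$$\alpha_j^{(j)} > \alpha_l^{(j)}, \quad \forall l \neq j, \tag{25}$$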
then, the estimation error of (22) vanishes, i.e.

$$\lim_{M, C \to \infty} \frac{\big\| \hat{\mathbf{h}}_j^{(j)\mathrm{CA}} - \mathbf{h}_j^{(j)} \big\|_2^2}{\big\| \mathbf{h}_j^{(j)} \big\|_2^2} = 0. \tag{26}$$

Proof: For the sake of notational convenience, in this proof
we assume the user in cell $j$ is the target user and thus drop
the superscript $(j)$. The desired channel is denoted by
$\mathbf{h}_j = \mathbf{h}_j^{(j)}$ and the interference channels are
$\mathbf{h}_l = \mathbf{h}_l^{(j)}$, $\forall l \neq j$. Since
$\mathbf{h}_l$, $l = 1, \ldots, L$, is considered as an $M \times 1$
complex Gaussian vector with the spatial correlation matrix
$\mathbf{R}_l = E\{\mathbf{h}_l \mathbf{h}_l^H\}$, the channels can
be factorized as

$$\mathbf{h}_l = \mathbf{R}_l^{1/2} \mathbf{h}_{Wl}, \quad l = 1, \ldots, L, \tag{27}$$

where $\mathbf{h}_{Wl} \sim \mathcal{CN}(\mathbf{0}, \mathbf{I}_M)$ is an
i.i.d. $M \times 1$ Gaussian vector with unit variance. We build
the proof of Theorem 1 on the general correlation model (27).
The proof consists of three parts, corresponding to the three
steps in Algorithm 1 respectively. More specifically, Lemma 1
(and the intermediate results towards Lemma 1) is the first part
of the proof. It shows that $\widetilde{\mathbf{u}}_{j1}$ aligns
asymptotically with the direction of the filtered channel vector
$\tilde{\mathbf{h}}_j = \mathbf{\Xi}_j \mathbf{h}_j$. The second part of
the proof is provided in Lemma 6, which proves that after
canceling the effect of the spatial filter using
$\mathbf{\Xi}'_j$, we obtain the direction of the true channel in
$\bar{\mathbf{u}}_{j1}$. The final part of the proof shows that by
projecting the LS estimate onto the subspace of
$\bar{\mathbf{u}}_{j1}$, we resolve the phase and amplitude of the
true channel.


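Lemma 1 Given Condition C1, if $\alpha_j > \alpha_l$,
$\forall l \neq j$, then there exists a unique
$0 \le \phi < 2\pi$, such that

$$\lim_{M, C \to \infty} \left\| \frac{\tilde{\mathbf{h}}_j}{\| \tilde{\mathbf{h}}_j \|_2} - \widetilde{\mathbf{u}}_{j1} e^{j\phi} \right\|_2 = 0, \tag{28}$$

where $\tilde{\mathbf{h}}_l \triangleq \mathbf{\Xi}_j \mathbf{h}_l$,
$l = 1, \ldots, L$.

Proof: The proof of Lemma 1 relies on several intermediate
results, namely Lemma 2 to Lemma 5.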

Lemma 2 Under Condition C1, the spectral norm of $\mathbf{\Xi}_j$
satisfies:


H 


M 11 




(29) 


Proof: See Appendix B. □

Lemma 2 indicates that the spectral norm of the covariance
of the noise (after multiplying by $\mathbf{\Xi}_j$) is bounded and
does not scale with $M$. This conclusion will be exploited when
we prove in Lemma 5 that the impact of noise on the dominant
eigenvector/eigenvalue vanishes.


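Lemma 3 [22] Let $\mathbf{A}_M$ be a deterministic $M \times M$
complex matrix with uniformly bounded spectral radius for all
$M$. Let $\mathbf{q} = \frac{1}{\sqrt{M}} [\, q_1, \ldots, q_M \,]^T$
where $q_i$, $i = 1, \ldots, M$, is an i.i.d. complex random
variable with zero mean, unit variance, and finite eighth moment.
Let $\mathbf{r}$ be a similar vector independent of $\mathbf{q}$.
Then as $M \to \infty$,

$$\mathbf{q}^H \mathbf{A}_M \mathbf{q} - \frac{1}{M} \mathrm{tr}\{\mathbf{A}_M\} \xrightarrow{\text{a.s.}} 0, \tag{30}$$

and

$$\mathbf{q}^H \mathbf{A}_M \mathbf{r} \xrightarrow{\text{a.s.}} 0, \tag{31}$$

where $\xrightarrow{\text{a.s.}}$ denotes almost sure convergence.

Note that in this paper, the condition on the finite eighth
moment always holds, since when we apply Lemma 3 the
components of the vector of interest are i.i.d. complex Gaussian
variables. It is well known that a complex Gaussian variable
with zero mean, unit variance has finite eighth moment.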
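
A quick numerical illustration of Lemma 3 is given below. It is a sketch under assumptions made here: the normalization $\mathbf{q} = \frac{1}{\sqrt{M}}[q_1, \ldots, q_M]^T$, a real symmetric Toeplitz test matrix with entries $0.5^{|m-n|}$ (spectral norm below 3 for every $M$), and complex Gaussian entries; none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 2000
idx = np.arange(M)
# Deterministic test matrix with uniformly bounded spectral norm.
A = 0.5 ** np.abs(idx[:, None] - idx[None, :])

def normalized_gaussian(rng, M):
    """(1/sqrt(M)) [q_1, ..., q_M]^T with q_i i.i.d. CN(0, 1)."""
    q = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2.0)
    return q / np.sqrt(M)

q = normalized_gaussian(rng, M)
r = normalized_gaussian(rng, M)
quad_gap = abs(np.vdot(q, A @ q) - np.trace(A) / M)   # |q^H A q - tr(A)/M|
cross_term = abs(np.vdot(q, A @ r))                   # |q^H A r|
```

Both quantities are already small at this size, and their typical magnitude shrinks roughly like $1/\sqrt{M}$ for this test matrix. This is the mechanism that makes quadratic forms in the channel vectors concentrate around traces in the proofs that follow.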


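Lemma 4 Given Condition C1,

$$\lim_{M \to \infty} \frac{1}{M} \tilde{\mathbf{h}}_j^H \tilde{\mathbf{h}}_l = 0, \quad \forall l \neq j, \tag{32}$$

$$\lim_{M \to \infty} \frac{1}{M} \tilde{\mathbf{h}}_l^H \tilde{\mathbf{h}}_l = \alpha_l, \quad l = 1, \ldots, L. \tag{33}$$

Proof: See Appendix C. □

Lemma 5 When Condition C1 is satisfied,

$$\lim_{M, C \to \infty} \left\| \frac{\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H}{MC} \frac{\tilde{\mathbf{h}}_j}{\| \tilde{\mathbf{h}}_j \|_2} - \alpha_j \frac{\tilde{\mathbf{h}}_j}{\| \tilde{\mathbf{h}}_j \|_2} \right\|_2 = 0. \tag{34}$$

Proof: See Appendix D. □

Lemma 5 proves that as $M, C \to \infty$, $\alpha_j$ is an asymptotic
eigenvalue of the random matrix
$\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H / (MC)$, with its
corresponding eigenvector converging to
$\tilde{\mathbf{h}}_j / \| \tilde{\mathbf{h}}_j \|_2$ up to a
random phase.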

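We now return to the proof of Lemma 1. Since
$\alpha_j > \alpha_l$, $\forall l \neq j$, one may readily obtain
from Lemma 5 and (32) that

$$\lim_{M, C \to \infty} \lambda_1\left\{ \frac{\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H}{MC} \right\} = \alpha_j, \tag{35}$$

and that there exists a unique $0 \le \phi < 2\pi$, such that

$$\lim_{M, C \to \infty} \left\| \frac{\tilde{\mathbf{h}}_j}{\| \tilde{\mathbf{h}}_j \|_2} - \mathbf{e}_1\left\{ \frac{\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H}{MC} \right\} e^{j\phi} \right\|_2 = 0, \tag{36}$$

which completes the proof of Lemma 1. □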

Now we show the second part of the proof of Theorem 1.
Note that in this part we make the implicit assumption that
the spectral norm of $\mathbf{\Xi}'_j$ satisfies
$\| \mathbf{\Xi}'_j \|_2 < +\infty$. A sufficient
(but not necessary) condition of such an assumption is that the
spectral norm of is finite. 


Lemma 6 Given 


we have 


lim 

M,C->oo 


- u.7 ie' 


j4> 


‘■3 II 2 


= 0 . 


(37) 


Proof: See Appendix E. □

The final part of the proof of Theorem 1 can be found
in Appendix F, which corresponds to step 3 of Algorithm 1.
The proof shows that projecting the LS estimate onto the
subspace of $\bar{\mathbf{u}}_{j1}$ leads to a noise-free estimate
asymptotically as $M, C \to \infty$. This concludes the proof of
Theorem 1. □

Interestingly, condition (25) in Theorem 1 can be replaced
with

$$\big\| \mathbf{\Xi}_j \mathbf{h}_j^{(j)} \big\|_2 > \big\| \mathbf{\Xi}_j \mathbf{h}_l^{(j)} \big\|_2, \quad \forall l \neq j, \tag{38}$$

which indicates that under suitable conditions on the spectral
norm of the channel covariance, after multiplying by the filter
$\mathbf{\Xi}_j$, if the power of the desired channel is higher than
that of the interference channel, then pilot contamination
disappears asymptotically, along with noise.

Note that we have so far made no assumption on antenna
placement in the analysis, other than the requirement for
uniform boundedness of the spectral norm of the channel
covariance. In the sequel we look into a specific model of ULA
as an example and seek to further understand the physical
meaning of the proposed method.

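We still assume $\mathbf{h}_j^{(j)}$ is the channel of interest. Denote its
angular support as $\Phi_d$. Decompose the interference channel
$\mathbf{h}_l^{(j)}, \forall l \neq j$, as follows:

$$\mathbf{h}_l^{(j)} = \mathbf{h}_{li}^{(j)} + \mathbf{h}_{lo}^{(j)}, \qquad (39)$$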

where 


qO 

^ E 

(40) 

^0 = ^ E 

(41) 


which means $\mathbf{h}_{li}^{(j)}$ is the residual multipath component of
the interference channel within the AoA region $\Phi_d$ of the
desired channel, while $\mathbf{h}_{lo}^{(j)}$ is the multipath component which
is outside $\Phi_d$.


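Theorem 2 For a ULA base station, under Condition C1, if
the residual multipath component of the interference channel
satisfies:

$$\left\| \mathbf{\Xi}_j \mathbf{h}_{li}^{(j)} \right\|_2 < \left\| \mathbf{\Xi}_j \mathbf{h}_j^{(j)} \right\|_2, \quad \forall l \neq j, \qquad (42)$$

then, the estimation error of the estimator (22) vanishes:

$$\lim_{M,C\to\infty} \frac{\left\| \hat{\mathbf{h}}_j^{(j)\mathrm{CA}} - \mathbf{h}_j^{(j)} \right\|_2^2}{\left\| \mathbf{h}_j^{(j)} \right\|_2^2} = 0. \qquad (43)$$

Proof: See Appendix G. □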

Theorem 2 further confirms the fact that for a base station
equipped with ULA, only the interference multipath components
that overlap with those of the desired channel affect the
performance of our pilot decontamination method. In other
words, the spatial filter $\mathbf{\Xi}_j$ removes the energy located in all
interference multipath originating from directions that do not
overlap with those of the desired channel. It is then sufficient
for the energy of the residual interference components to
be below that of the desired channel to allow for a full
decontamination.
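The angular discrimination described above rests on the asymptotic orthogonality of ULA steering vectors associated with different angles of arrival. A minimal sketch of that property follows, assuming a half-wavelength ULA with the steering vector convention $a_m(\theta) = e^{-j 2\pi (D/\lambda) m \cos\theta}$; the function names and the chosen angles are hypothetical and serve only as an illustration.

```python
import numpy as np

def ula_steering(theta_deg, M, spacing=0.5):
    """Steering vector of an M-element ULA; spacing is in wavelengths (assumed 0.5)."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * spacing * m * np.cos(np.deg2rad(theta_deg)))

def normalized_correlation(theta1_deg, theta2_deg, M):
    """|a(theta1)^H a(theta2)| / M, which equals 1 for identical angles."""
    a1 = ula_steering(theta1_deg, M)
    a2 = ula_steering(theta2_deg, M)
    return abs(np.vdot(a1, a2)) / M

if __name__ == "__main__":
    for M in (8, 32, 128, 512):
        print(M, normalized_correlation(30.0, 60.0, M))
```

For two distinct angles the normalized correlation is bounded by $1/(M |\sin(\Delta/2)|)$ with $\Delta = 2\pi (D/\lambda)(\cos\theta_1 - \cos\theta_2)$, so it decays as the array grows, whereas paths sharing the same angle remain fully correlated.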


C. Generalization to multiple users per cell 

Now we generalize the covariance-aided amplitude based
projection to the multi-user setting where $K$ users are served
simultaneously in each cell. We consider the estimation of
user channel $\mathbf{h}_{jk}^{(j)}$ in the remainder of this section.

Define a matrix $\mathbf{H}_{j\backslash k}^{(j)}$ as a sub-matrix of $\mathbf{H}_j^{(j)}$ after removing
its $k$-th column:

$$\mathbf{H}_{j\backslash k}^{(j)} \triangleq \left[ \mathbf{h}_{j1}^{(j)} \cdots \mathbf{h}_{j(k-1)}^{(j)} \;\; \mathbf{h}_{j(k+1)}^{(j)} \cdots \mathbf{h}_{jK}^{(j)} \right]. \qquad (44)$$

A corresponding estimate of (44), denoted by $\hat{\mathbf{H}}_{j\backslash k}^{(j)}$, is obtained
by removing the $k$-th column of $\hat{\mathbf{H}}_j^{(j)}$, which can be
an LS estimate, MMSE estimate, or other linear/non-linear
estimate of $\mathbf{H}_j^{(j)}$. For demonstration purpose only, in this paper
we use the simplest LS estimate, which already shows very
good performance.

In order to adapt the method in Section V-A to the multi-user
scenario, we propose to first neutralize the intra-cell
interference with a Zero-Forcing (ZF) filter $\mathbf{T}_{jk}$ based on the
LS estimate, and then apply the spatial filter $\mathbf{\Xi}_{jk}$. After
these two filters, the data signal is now

$$\widetilde{\mathbf{W}}_{jk} \triangleq \mathbf{\Xi}_{jk} \mathbf{T}_{jk} \mathbf{W}_j, \qquad (45)$$


where 


and 


T,, , (46) 

R«. (47) 


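The rest of this method proceeds as in the single-user setting.
Take the dominant eigenvector of $\widetilde{\mathbf{W}}_{jk} \widetilde{\mathbf{W}}_{jk}^H / C$:

$$\tilde{\mathbf{u}}_{jk1} = \mathbf{e}_1\left\{ \frac{1}{C} \widetilde{\mathbf{W}}_{jk} \widetilde{\mathbf{W}}_{jk}^H \right\}. \qquad (48)$$

The estimate of the direction of $\mathbf{h}_{jk}^{(j)}$ is obtained by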




^'jk^jkl 


(49) 


where 



(50) 


Finally the phase and amplitude ambiguities are resolved by 
the training sequence, and we have the estimate of 

(51) 

Note that in this method, we build the ZF-type filter $\mathbf{T}_{jk}$ based
on a rough LS estimate. Further improvements can be attained
with higher quality estimates at the cost of higher complexity.
As a simple example, we can reduce the effect of noise on
the estimate by first applying EVD of $\mathbf{W}_j \mathbf{W}_j^H / C$,
then removing the subspace where the noise lies, and finally
performing LS estimation. These extensions are out of the
scope of this paper.
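The intra-cell interference removal step can be sketched as follows. The sketch assumes that the ZF filter takes the standard orthogonal-projection form $\mathbf{I}_M - \mathbf{A}(\mathbf{A}^H\mathbf{A})^{-1}\mathbf{A}^H$, where $\mathbf{A}$ is the channel estimate with the $k$-th column removed; this form and the function names `remove_column` and `zf_filter` are assumptions made for illustration and are not taken from the definitions in the text.

```python
import numpy as np

def remove_column(H, k):
    """Return H without its k-th column (0-based index)."""
    return np.delete(H, k, axis=1)

def zf_filter(H_est, k):
    """Projector onto the orthogonal complement of the other users' estimated channels.

    For a full-column-rank A, A @ pinv(A) equals A (A^H A)^{-1} A^H.
    """
    A = remove_column(H_est, k)
    M = H_est.shape[0]
    return np.eye(M) - A @ np.linalg.pinv(A)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    H = rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4))
    T = zf_filter(H, 1)
    # The other users are nulled; user 1 keeps most of its energy.
    print(np.linalg.norm(T @ remove_column(H, 1)), np.linalg.norm(T @ H[:, 1]))
```

With a small number of antennas the projector also removes the part of the target channel that lies in the span of the other users' channels, which is one source of loss at small $M$.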


VI. Low-complexity alternatives 

In this section, we propose two alternatives to the method
shown in Section V, aiming at lower computational complexity
at the cost of mild performance losses.

A. Subspace and amplitude based projection

The low-rankness of channel covariance implies that the
uplink received desired signal lives in a reduced subspace.
By projecting the received data signal onto the signal
space of $\mathbf{R}_{jk}^{(j)}$, we are able to preserve the signal from user $k$
in cell $j$ while removing the interference and noise that live
in its complementary subspace. In the following, we show a
subspace-based signal space projection method that relies on
the covariance of the desired channel only. For ease of exposition,
we simplify the system setup to a single user per cell. Let the
user in cell $j$ be the target user. The EVD of the covariance
of the desired channel is

$$\mathbf{R}_j^{(j)} = \mathbf{V}_j \mathbf{\Sigma}_j \mathbf{V}_j^H, \qquad (52)$$

where the diagonal entries of $\mathbf{\Sigma}_j$ contain the non-negligible
eigenvalues of $\mathbf{R}_j^{(j)}$. Then we project the received data signal
onto the signal space of $\mathbf{R}_j^{(j)}$, or the column space of $\mathbf{V}_j$:

$$\widetilde{\mathbf{W}}_j \triangleq \mathbf{V}_j \mathbf{V}_j^H \mathbf{W}_j. \qquad (53)$$

The rest of this method follows the same idea as the
covariance-aided amplitude based projection scheme. Taking
the eigenvector corresponding to the largest eigenvalue of
$\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H / C$:

$$\tilde{\mathbf{u}}_{j1} = \mathbf{e}_1\left\{ \frac{1}{C} \widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H \right\}, \qquad (54)$$

the channel estimate of is given by 

hf(55) 

where the superscript “SA” stands for “subspace and amplitude
based projection”. Note that this method does not require
the covariance of interference channels or variance of noise. It
explicitly relies on the assumption that the desired covariance
matrix has a low-dimensional signal subspace, with some
degradation expected when this condition is not realized in
practice. In fact, if $\mathbf{R}_j^{(j)}$ has full rank, this method degrades
to pure amplitude based projection.
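The subspace projection step can be illustrated with a small numerical sketch. It assumes a half-wavelength ULA, a covariance built from a dense grid of angles inside the desired angular support, and a relative eigenvalue threshold to decide which eigenvalues count as non-negligible. The threshold value, the angular supports, and all function names are hypothetical choices for illustration rather than parameters taken from the paper.

```python
import numpy as np

def ula_steering(theta_deg, M, spacing=0.5):
    m = np.arange(M)
    return np.exp(-2j * np.pi * spacing * m * np.cos(np.deg2rad(theta_deg)))

def ula_covariance(theta_min, theta_max, M, n_grid=400):
    """Covariance of a channel whose AoAs are uniform in [theta_min, theta_max] degrees."""
    thetas = np.linspace(theta_min, theta_max, n_grid)
    A = np.stack([ula_steering(t, M) for t in thetas], axis=1)  # M x n_grid
    return A @ A.conj().T / n_grid

def signal_subspace(R, rel_threshold=0.1):
    """Eigenvectors of R whose eigenvalues exceed rel_threshold times the largest one."""
    vals, vecs = np.linalg.eigh(R)          # ascending eigenvalues
    return vecs[:, vals > rel_threshold * vals[-1]]

def random_channel(theta_min, theta_max, M, P, rng):
    """Sum of P paths with random AoAs in the given support and random complex gains."""
    thetas = rng.uniform(theta_min, theta_max, P)
    gains = (rng.standard_normal(P) + 1j * rng.standard_normal(P)) / np.sqrt(2 * P)
    return sum(g * ula_steering(t, M) for g, t in zip(gains, thetas))

def retained_energy(V, h):
    """Fraction of the energy of h that survives the projection V V^H."""
    return np.linalg.norm(V.conj().T @ h) ** 2 / np.linalg.norm(h) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = 64
    V = signal_subspace(ula_covariance(60.0, 80.0, M))
    print("subspace dimension:", V.shape[1])
    print("desired:", retained_energy(V, random_channel(60.0, 80.0, M, 50, rng)))
    print("interferer:", retained_energy(V, random_channel(120.0, 140.0, M, 50, rng)))
```

In this sketch the desired channel keeps almost all of its energy after projection, while an interferer with a disjoint angular support is strongly attenuated. An interferer sharing the same support would not be attenuated, which is where the amplitude-based step takes over.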

Note that this “SA” estimator has lower complexity than
the “CA” estimator (22) in the sense that 1) the “SA” estimator
does not require the statistical knowledge of the interference
channels or the variance of the noise, and 2) the “SA” estimator
skips step 2 in Algorithm 1.

The physical condition under which full decontamination
is achieved with this method is shown below in the case of
a ULA. We denote the angular support of the desired channel
by $\Phi_d$ and the multipath components of the interference
channel falling in $\Phi_d$ as $\mathbf{h}_{li}^{(j)}$.


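then, the estimation error of the estimator (55) vanishes:

$$\lim_{M,C\to\infty} \frac{\left\| \hat{\mathbf{h}}_j^{(j)\mathrm{SA}} - \mathbf{h}_j^{(j)} \right\|_2^2}{\left\| \mathbf{h}_j^{(j)} \right\|_2^2} = 0. \qquad (58)$$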


Proof: Due to lack of space, we skip the complete proof and 
only give two key steps below. By applying the asymptotic or¬ 
thogonality between two steering vectors which are associated 
with different AoAs ( Lemma 3 in Q), we may readily obtain 

1 ,, tG 1 ,Sl) (59) 




lim 

M—fCX. 

lim 


Vm 


1 




1 


Aj) 


Vm 


^ J, 


(60) 


which means the multipath components of interference that
fall outside $\Phi_d$ disappear asymptotically after the projection
by $\mathbf{V}_j \mathbf{V}_j^H$. Then, equation (57) ensures that


lim =0,( 


(61) 


where

$$\underline{\mathbf{h}}_l^{(j)} \triangleq \mathbf{V}_j \mathbf{V}_j^H \mathbf{h}_l^{(j)}, \quad l = 1, \ldots, L. \qquad (62)$$


□ 

Note that in Theorem 3, condition (57) is less restrictive than
the uniform boundedness of the spectral norm of the channel
covariance. In the special case of zero angular spread, the rank
of channel covariance becomes one. Denote the deterministic
AoA from the user in cell $l$ to base station $j$ as $\bar{\theta}_l^{(j)}$. We can
easily see that the channel estimation error of (55) vanishes
completely as $M, C \to \infty$ as long as

$$\bar{\theta}_l^{(j)} \neq \bar{\theta}_j^{(j)}, \quad \forall l \neq j, \qquad (63)$$

which occurs with probability one.

When channel covariance is not available, we can still
benefit from the subspace projection method by approximating
$\mathbf{V}_j$ with a subset of the discrete Fourier transform (DFT) basis as
shown in [1]. This DFT basis can be chosen based on a small
number of channel observations. The generalization to the multi-user
case can be done by introducing the ZF filter (46) as in
Section V-C. Due to lack of space, we skip the details.


B. MMSE + amplitude based projection 

Another alternative is to directly project the MMSE estimate
onto the subspace of $\mathbf{E}^{(j)}$ obtained by EVD of $\mathbf{W}_j \mathbf{W}_j^H / C$
as in Section IV. The estimator for the multi-user channel $\mathbf{H}_j^{(j)}$
is given by

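Theorem 3 For a ULA base station, if the power of the interference
channel that falls into the angular support $\Phi_d$ satisfies

$$\left\| \mathbf{h}_{li}^{(j)} \right\|_2 < \left\| \mathbf{h}_j^{(j)} \right\|_2, \quad \forall l \neq j, \qquad (56)$$

and the channel covariance satisfies, $\forall M \in \mathbb{Z}^+, \forall l \neq j$,

$$\left\| \mathbf{R}_l^{(j)\frac{1}{2}} \mathbf{V}_j \mathbf{V}_j^H \mathbf{R}_l^{(j)\frac{1}{2}} \right\|_2 < +\infty, \qquad (57)$$


(i)MA ^ ( t(Y^ Rpp -f all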


H 


where 


TT(2)^ 


KM 


s 


I Im = [ Si 


1^1 


E^"' (g)E(^'P 


Lm • • • Sic I 


J-M 




(64) 

(65) 

(66)
and


= diag{Rp^ 


T? 

11 ’ ^IK 




(67) 


The superscript “MA” denotes MMSE + amplitude based
projection. It is worth noting that both the amplitude-based
projection and the angular-based projection require a large number
of antennas to achieve complete decontamination. In contrast,
the MMSE estimator is efficient with a very small number of
antennas. As $M$ grows, the MMSE estimator starts to reduce
interference earlier than the previously proposed methods, as
will be shown by simulations in Section VII. However, unlike
the previously proposed schemes, this “MA” estimator cannot
achieve complete decontamination when the interference
channel overlaps with the desired channel in both the angular
and power domains.

VII. Numerical Results 

This section contains numerical results of our different 
channel estimation schemes compared with prior methods. In 
the simulation, we have multiple hexagonally shaped adjacent 
cells in the network. The radius of each cell is 1000 meters. 
Each base station has M antennas, which forms a ULA, with 
half wavelength antenna spacing. The length of pilot sequence 
is T = 10 . 

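Two performance metrics are considered. The first is the
normalized channel estimation error

$$\epsilon \triangleq \frac{1}{KL} \sum_{j=1}^{L} \sum_{k=1}^{K} \left( \frac{\left\| \hat{\mathbf{h}}_{jk}^{(j)} - \mathbf{h}_{jk}^{(j)} \right\|_2^2}{\left\| \mathbf{h}_{jk}^{(j)} \right\|_2^2} \right). \qquad (68)$$

The estimation errors in the plots are obtained by Monte Carlo
simulations and displayed in dB scale.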
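A normalized estimation error of this kind can be computed as in the following sketch, which averages the per-channel ratio of squared error to squared channel norm and converts the result to dB. The function name and the convention that each column holds one user channel are assumptions made for illustration.

```python
import numpy as np

def normalized_error_db(H_hat, H_true):
    """Average of ||h_hat - h||^2 / ||h||^2 over columns, expressed in dB."""
    num = np.sum(np.abs(H_hat - H_true) ** 2, axis=0)
    den = np.sum(np.abs(H_true) ** 2, axis=0)
    return 10.0 * np.log10(np.mean(num / den))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H = rng.standard_normal((8, 5)) + 1j * rng.standard_normal((8, 5))
    # A 10% amplitude error on every channel gives a ratio of 0.01, i.e. -20 dB.
    print(normalized_error_db(1.1 * H, H))
```

In a Monte Carlo study the ratio would additionally be averaged over independent channel realizations before the conversion to dB.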

The second metric is the uplink per-cell rate when MRC 
receiver (based on the obtained channel estimate) is used at 
the base station side. 

In all simulations presented in this section, we assume that
the channel covariance matrix is estimated using 1000 exact
channel realizations. The multipath angle of arrival of any
channel (including the interference channel) follows a uniform
distribution centered at the direction corresponding to line-of-sight
(LoS). The number of multipath components is $P = 50$. According
to the coherence time model, for a mobile user moving
at a vehicular speed of 70 km/h in an environment of 2.6 GHz
carrier frequency and a high delay spread of 5 μs (corresponding to
an excess distance of 1.5 km), the channel can be assumed
coherent over 500 transmitted symbols. Thus, we let
$C = 500$ in simulations, although a larger coherence time can
be expected in practice for a user with lower mobility.
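The physical quantities quoted above can be cross-checked with a few lines of arithmetic. The sketch below only verifies the excess distance implied by the delay spread and computes the maximum Doppler shift for the stated speed and carrier frequency. It does not reproduce the coherence time model used to arrive at 500 symbols, which is not specified here; the function names and the rounded value of the speed of light are assumptions.

```python
C_LIGHT = 3.0e8  # speed of light in m/s (rounded, assumption)

def excess_distance_m(delay_spread_s):
    """Path-length difference corresponding to a given delay spread."""
    return C_LIGHT * delay_spread_s

def max_doppler_hz(speed_kmh, carrier_hz):
    """Maximum Doppler shift f_D = v * f_c / c."""
    return (speed_kmh / 3.6) * carrier_hz / C_LIGHT

if __name__ == "__main__":
    print(excess_distance_m(5e-6))     # metres, for a 5 microsecond delay spread
    print(max_doppler_hz(70.0, 2.6e9)) # Hz, for 70 km/h at 2.6 GHz
```

A delay spread of 5 μs corresponds to 1.5 km of excess distance, consistent with the figure quoted in the text, and the maximum Doppler shift is roughly 170 Hz.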

Note that in all simulations, the amplitude-based projection
and MMSE + amplitude based projection follow the enhanced
eigenvector selection strategy shown in Remark with the
design parameter $\mu = 0.2$.

We first illustrate Theorem 1 in Fig. 1. Suppose we have
a two-cell network, with each cell having one user. In order
to make the interference overlap in the power domain with
the desired signal, we set the path loss exponent $\gamma = 0$. The
power of the interference channel has equal probability to be
higher or lower than the power of the desired channel. The
user in each cell is deliberately put in a symmetrical position
such that the multipath angular supports of the interference
and the desired channel are half overlapping with each other.

Fig. 1. Estimation performance vs. M, 2-cell network, 1 user per cell, path
loss exponent $\gamma = 0$, partially overlapping angular support, AoA spread 60
degrees, SNR = 0 dB.


In the figure, “LS estimation” and “Pure MMSE” denote
the system performances when an LS estimator and an MMSE
estimator are used, respectively. “Pure amplitude” denotes
the case when we apply the generalized amplitude based
projection method only. “MMSE + amplitude” represents
the proposed estimator (64). “Covariance-aided amplitude”
denotes the proposed covariance-aided amplitude based projection
method (22). The curve “MMSE - no interference”
shows the estimation error of an MMSE estimator in an
interference-free scenario. As can be seen from Fig. 1, due
to the overlapping interference in both angle and power
domains, the performance of all estimators saturates quickly
with the number of antennas, except the proposed covariance-aided
amplitude based projection method, which eventually
outperforms interference-free MMSE estimation.¹

In Fig. 2 and Fig. 3 we show the performance of estimation
error and the corresponding uplink per-cell rate for a 7-cell
network, with a single user per cell. The users are assumed to
be distributed randomly and uniformly within their own cells
excluding a central disc with radius 100 meters. The angular
spread of the user channel (including interference channel)
is 30 degrees. The path loss exponent is now $\gamma = 2$. As
we may observe, the traditional LS estimator suffers from
severe pilot contamination. The pure amplitude based method
and the pure MMSE method alleviate the pilot interference,
yet saturate with the number of antennas. These saturation
effects come from the overlapping of the interference and the
desired channels in power and angular domains respectively.
The “MMSE + amplitude” approach outperforms these two
known methods as it discriminates against interference in both
amplitude and angular domains. However this scheme cannot

¹The reason is that the performance of the interference-free MMSE estimation
has a non-vanishing lower bound due to white Gaussian noise. On the
contrary, our proposed covariance-aided amplitude based projection method
eliminates the effects of noise and interference asymptotically.

Fig. 2. Estimation performance vs. M, 7-cell network, one user per cell,
AoA spread 30 degrees, path loss exponent $\gamma = 2$, cell-edge SNR = 0 dB.

Fig. 3. Uplink per-cell rate vs. M, 7-cell network, one user per cell, AoA
spread 30 degrees, path loss exponent $\gamma = 2$, cell-edge SNR = 0 dB.

cope with the case of overlapping in both domains. Owing
to its robustness, the covariance-aided amplitude projection
method outperforms the rest in terms of both estimation error
and uplink per-cell rate.

We now turn our attention to the multi-cell multi-user scenario.
Fig. 4 and Fig. 5 show the channel estimation performance
and the corresponding uplink per-cell rate for a 7-cell network
with each cell having 4 users. In these two figures, we add the
curve of subspace and amplitude based projection, which is
denoted in the figures as “Subspace + amplitude”. The other
parameters remain unchanged compared with those in Fig. 2
and Fig. 3. We can notice that in Fig. 4 the covariance-aided
amplitude projection method has some performance
loss with respect to the low-complexity MMSE + amplitude
method and the MMSE method when the number of antennas
is small. This is due to the following two facts: 1) when $M$
is small, it is well known that MMSE works well, but not
the amplitude based methods, and 2) with small $M$, the
asymptotic orthogonality of channels of different users is not
fully exhibited, and consequently a small amount of the signal of
interest is removed by the ZF filter $\mathbf{T}_{jk}$, along with intra-cell
interference. However, this is not a concern in the sense that 1) as
the number of antennas grows, the covariance-aided amplitude
projection method quickly outperforms the other methods; and
2) the per-cell rate of this proposed method is still good even
with a moderate number of antennas, e.g., $M > 25$. It is also
interesting to note that the low-complexity alternative scheme,
the subspace and amplitude based projection method, has some
minor performance loss, yet keeps approximately the same
slope as the covariance-aided amplitude projection.



Fig. 4. Estimation performance vs. M, 7-cell network, 4 users per cell, AoA
spread 30 degrees, path loss exponent $\gamma = 2$, cell-edge SNR = 0 dB.

Fig. 5. Uplink per-cell rate vs. M, 7-cell network, 4 users per cell, AoA
spread 30 degrees, path loss exponent $\gamma = 2$, cell-edge SNR = 0 dB.

VIII. Conclusions 

In this paper we proposed a series of robust channel estimation
algorithms exploiting path diversity in both angle and
amplitude domains. The first method, called “covariance-aided
amplitude based projection”, is robust even when the desired
channel and the interference channels overlap in multipath
AoA and are not separable just in terms of power. Two
low-complexity alternative schemes were proposed, namely
“subspace and amplitude based projection” and “MMSE +
amplitude based projection”. Asymptotic analysis shows the
condition under which the channel estimation error converges
to zero.

Appendix

A. Proof of Proposition 

Denote the associated path loss as $\beta$. The covariance $\mathbf{R}$ is
a Toeplitz matrix, with its $mn$-th entry given by


R(r 


i) = l 3 f ( 6 - 

Jo 

1 


/3A 

2ttD 

-/ 


■'A" p(arccos(^)) 

tD 


J{n-m)x 


dx, 


where 


/(^) = 


/3A p ( arccos(^)) 
1 _ ( 

^ \2tjD) 


D 


(70) 


(71) 


2t:D 2ttD 

f(x) < + 00 ,-^ <x< -j—. 


(72) 


Theorem 4 


/|24|? If a|,”^ < < 


< are 


1^ we 


obtain that lim IjRIU = Mf < 


M—fOO 


-1 


are both positive 


S2ll2< 


R; + CT^Im 




R 


■3 112 


< 


c 


It is straightforward to show that 




Il2 “ Il“2ll2 < ^4 ’ 


(73) 


(74) 


C. Proof of Lemma 4


Using the spatial correlation model (27), we may write


Thfh, = Th»^R|sfS,Rfh 


wz- 


(75) 


=‘ M" 

By an abuse of notation, we now use the operator $\lambda_1\{\cdot\}$ to
represent the largest singular value of a matrix. Appealing to
the singular value inequalities in ph] , we can show that the 
maximum singular value of RJHySjR;^ yields 


Ai{RjHfH,Rf } < Ai{R|}Ai{Hf H,Rf } 

<C^Ai{HfH,}Ai{Rh 


^ ^4’ 


(76) 

(77) 

(78) 


Since $0, \pi \notin \Phi$, or in other words, $p(0) = p(\pi) = 0$, and that
$p(\theta) < \infty, \forall \theta \in \Phi$, it follows that $f(x)$ is uniformly bounded:


which means the spectral radius of the complex matrix 
R| is uniformly bounded for any M. Thus, accord¬ 

ing to Lemma |3[ ]ghyhz,VZ f j, converges almost surely to 
zero. Thus p2[) holds true. In a similar way, we can prove 
(33l. This concludes the proof of Lemma □ 


D. Proof of Lemma 
Dehne 


Thus, the Toeplitz matrix $\mathbf{R}$ is related to the real integrable and
uniformly bounded generating function $f(x)$, with its entries
being Fourier coefficients of $f(x)$. We now resort to the known
result on the spectrum of the $n \times n$ Toeplitz matrices $\mathbf{T}_n(f)$
defined by the generating function $f(x)$. Denote by ess inf and
ess sup the essential minimum and the essential maximum of
$f$, i.e., the infimum and the supremum of $f$ up to within a set
of measure zero. Let $m_f = \operatorname{ess\,inf} f$ and $M_f = \operatorname{ess\,sup} f$.


4 lim 

( Wf ) 

(79) 

C —^00 

\C ^ ^ J 

= 

+ X] h/hr + • 

(80) 





In this proof, we first consider the noise free scenario and let

( 81 ) 


Tnf - hjhf 


iAj 


the eigenvalues of T„(/), then, the spectrum of T„(/) is 
contained in {mf,Mf); moreover lim aI"^ = m/ and 

n—foo 

lim A^"_y^ = Mf. 

By invoking Theorem 


where the subscript “nf” denotes noise free. We can then write

Tnf hj 


lim 

M—fOO 




M 


liiji 


Tnf h, 


^3 


(82) 


2 

)^( 


H I Tnf 


^3 


$\infty$. In addition, for any finite $M$, the inequality $\|\mathbf{R}\|_2 < \infty$
always holds true. This concludes the proof. □
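The Toeplitz spectrum result invoked in this proof can be checked numerically on a simple generating function. The sketch below uses $f(x) = 2 + 2\cos x$, an example chosen only for illustration and unrelated to the channel model, whose Fourier coefficients are $t_0 = 2$ and $t_{\pm 1} = 1$, so that $m_f = 0$ and $M_f = 4$. The function names are hypothetical.

```python
import numpy as np

def toeplitz_from_symbol(n):
    """n x n Toeplitz matrix generated by f(x) = 2 + 2 cos(x): t_0 = 2, t_{+1} = t_{-1} = 1."""
    return 2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

def spectrum_range(n):
    """Smallest and largest eigenvalue of the n x n Toeplitz matrix."""
    eig = np.linalg.eigvalsh(toeplitz_from_symbol(n))  # ascending order
    return eig[0], eig[-1]

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, spectrum_range(n))
```

For every $n$ the eigenvalues stay strictly inside $(0, 4)$, and the extreme eigenvalues approach $0$ and $4$ as $n$ grows, so the spectral norm converges to $M_f$ and remains bounded.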


-3 112 


- 3 112 
2 a i 


M h 


B. Proof of Lemma ^ 

Since R^ and ^ 

semi-dehnite (PSD) Hermitian matrices, we can directly apply 
the inequalities of | |25) on the eigenvalues of the product of 
two PSD Hermitian matrices 

X -1 


|h-||^— lim 
I—2 II 2 M^oo M 

a? 




3 112 
2 


- 3 II 2 


= lim ( , 

M—foo M 

= lim —^ 

= 0 , 

which proves that when $M \to \infty$, an eigenvalue of the
random matrix $\mathbf{\Gamma}_{\mathrm{nf}}/M$ converges to $\alpha_j$, with its corresponding
eigenvector converging to $\underline{\mathbf{h}}_j / \|\underline{\mathbf{h}}_j\|_2$ up to a random phase.
Then we consider the Hermitian matrix as a 


perturbation on Tnf/M. Due to the Bauer-Fike Theorem |27| 


on the perturbation of eigenvalues of Hermitian matrices, 
together with Lemma we have for 1 < i < L: 


lim 

M—¥ 0 C 




which indicates that the spectral norm of is also 

uniformly bounded. This proves Lemma □ 


< lim 

M-j-oo M '' ■' ^ ' 

= 0 . 


(83) 

(84) 

(85)
The above result shows that the impact of the perturbation
on the eigenvalues of $\mathbf{\Gamma}_{\mathrm{nf}}/M$ vanishes as $M \to \infty$. In other
words, $\alpha_j$ is again an asymptotic eigenvalue of $\mathbf{\Gamma}/M$. Now we
verify that despite the perturbation, the eigenvector of $\mathbf{\Gamma}/M$
corresponding to the asymptotic eigenvalue $\alpha_j$ also converges
to $\underline{\mathbf{h}}_j / \|\underline{\mathbf{h}}_j\|_2$ up to a random phase. To prove this, it is
sufficient to show that


lim 

M—>-oo 


< lim 

M—¥qo 

^=^0, 


r 

^3 

M 

h2ll2 

r„f 

> 

M 

11^2-11 




— a-, 




-J \\2 


0-2 h- 


M 


-0 Il2 


where (a) is due to the definition of the spectral norm 

= 0 . 


lim 

M—>-oo 



^2 

M 

1*72 112 


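It follows that

$$\lim_{M,C\to\infty} \left\| \frac{\widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H}{MC} \frac{\underline{\mathbf{h}}_j}{\|\underline{\mathbf{h}}_j\|_2} - \alpha_j \frac{\underline{\mathbf{h}}_j}{\|\underline{\mathbf{h}}_j\|_2} \right\|_2 = 0,$$

which concludes the proof of Lemma 5.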
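The perturbation argument used in this appendix relies on the fact that a Hermitian perturbation cannot move the eigenvalues of a Hermitian matrix by more than its spectral norm. For Hermitian matrices the Bauer-Fike bound holds with condition number one, and its sorted-eigenvalue form is Weyl's inequality. The following sketch checks that inequality on random matrices; the matrix size, the perturbation scale, and the function names are arbitrary assumptions for illustration.

```python
import numpy as np

def random_hermitian(rng, n, scale=1.0):
    """Random n x n Hermitian matrix with entries of the given scale."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return scale * (A + A.conj().T) / 2

def max_eigenvalue_shift(A, E):
    """Largest |lambda_i(A + E) - lambda_i(A)| over the sorted eigenvalues."""
    return np.max(np.abs(np.linalg.eigvalsh(A + E) - np.linalg.eigvalsh(A)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = random_hermitian(rng, 50)
    E = random_hermitian(rng, 50, scale=0.01)
    # The eigenvalue shift never exceeds the spectral norm of the perturbation.
    print(max_eigenvalue_shift(A, E), np.linalg.norm(E, 2))
```

When the spectral norm of the perturbation vanishes relative to the eigenvalue of interest, as it does here after normalization by $M$, the perturbed eigenvalue converges to the unperturbed one.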

E. Proof of Lemma 6

We can derive


= lim 

M,C->oo 


< lim 

M,C^oo 


\ II h. 


— Mjie- 


.20 . 


Ujie- 


.3<l> 


II “2 "21II 2 




“>21 


lim 


l“>2ie^'1 


'2 “21II 2 


= 1 


M,C^oo llH'Ujill^ 

(91) 


( 86 ) 


In a similar way, we can prove that 


lim < 1. 


M,C-)-oo 


—j 


Combining ( |M] l and ( |92] i, we obtain 

-3 “2 “2"2ll^ 


(87) 


lim 


M,c^oo ||H'h^.|y|H'u,il|^ 
With analogous derivation, we can prove 


= 1. 


( 88 ) 


□ 


lim 


2l 2 2—2 

M,ci—>00 IlH^h ll ||H^U,-iI| 

11 2 —2 11 2 11 2 2111 2 


= 1. 


Applying (93 i and (94 1 to (891 gives 


lim 

M,C->oo 


3>2 


H'uj-ie- 


J4> 






'2 21 11 2 


= 0. 


The following equality holds 


(92) 


(93) 


(94) 


(95) 


lim 

M,C->oo 


= lim 

M.C-^oo 


3>2 

[I] 

ll“ 2—2 II 2 

l|3'u„ i II 

II 2 2 J -112 

( 

H'n ip20 

(l|s;!!jk 

Il“>2l|l2 

43 T 

[if 


J “>2 2 

Il“2"2l|l2; 


H 


= 2 — lim 

M,C->oo 




|32h2ll2 


l“>2l| 


+ 1 




L “>JT 


We treat the following quantity separately 


lim 

M,C->oo 




—2 2 2 




= lim 
M,C->oo 


= lim 

M,C->-oo 


■ Ujie- 


.20 _ 


L “>JT 


“^■|>L 


I “2 "21 


(89) 


H'hj = H'Hjhj = R^Rjhj = h^, 


proving that 


lim 

M,C->.oo 


_u-iei'^ 

Lll, 


= 0 , 


which completes the proof of Lemma 6.

F. Proof of Theorem 1

From (37) we readily obtain


lim 


hfu2i 


= 1 . 


M,C^oo ||hj||2 

Recall from the uplink training Q, we have 


(90) 


= ^UjiU^ I lijs"^ + ^ his^ + N j s* 


and hence 


lim 

M,C->oo 


hf -h, 


(96) 

(97) 

□ 


(98) 


(99) 


( 100 ) 


* 2 II 2
= lim ' ^

M,C-)-oo


(hf -h,)^(hCA-h,


*2 II 2


= lim

M,C^od ||h
■2 II 2
)


= lim 

M,C-)-oo ||h 

Equation 
lim 


1 


(hf hj - hf Ujiufihj ) . 


■2 II 2 

ensures that 
1 


„ h:f u„-i h, = 

\\hA\l ^ ^ ^ 


^hfhj = 1, 


*2 II 2 


*2 II 2 


which concludes the proof.
(102)

□


G. Proof of Theorem 2

This proof follows steps similar to those of the proof of Theorem 1. Thus
we give a sketch of the proof only. Define

$$\mathbf{\Gamma} \triangleq \lim_{C\to\infty} \left( \frac{1}{C} \widetilde{\mathbf{W}}_j \widetilde{\mathbf{W}}_j^H \right)$$
1^3


_1_ TsH 

~r ’ 


(103) 

(104) 


where $\underline{\mathbf{h}}_l^{(j)} \triangleq \mathbf{\Xi}_j \mathbf{h}_l^{(j)}$, $l = 1, \ldots, L$. Due to the asymptotic
orthogonality between steering vectors in disjoint angular 
support, i.e., Lemma 3 in Q, we can easily show that in large 


antenna limit, falls into the null space of Thus 

lim lim (105) 

M-s-oo M-i-oo M“‘® 

Then we have 


(2) 


lim — 
M—>-oo M 




(2)v,(2)ff 
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/ ^ —h —It 

1^3 


Under Condition Cl, it is easy to show that 


lim 

M—fCC) 


r 

hf 


h® 


M 


M 




—3 

2 

—3 

2 


^2-7 


= 0. (106) 


Given the following condition 

yi^j 


“2 


< 


“2“j 


(107) 


it is clear that the dominant eigenvector of T /M converges to 
(up to a random phase), with its corresponding 




(j) 


eigenvalue converging to /M. Then, using the same 

technique in the proof of Lemma we obtain 


ht- 

lim 

M,C->oo 


U,i 


,(2) 


= 1 . 


(108) 


Finally, we readily obtain (43) by analogous derivations in
Appendix F. □